In 1975 John Tukey proposed a multivariate median which is the 'deepest' point
in a given data cloud in $\mathbb{R}^d$ (Tukey, 1975). In measuring the depth of an arbitrary
point $z$ with respect to the data, he considered hyperplanes through $z$ and determined
its 'depth' by the smallest portion of data that are separated by such a hyperplane.
Since then, Tukey's idea has proved extremely fruitful. A rich statistical methodology
has developed that is based on data depth and, more generally, nonparametric
depth statistics. General notions of data depth have been introduced as well as many
special ones. These notions vary regarding their computability and robustness and
their sensitivity to reflect asymmetric shapes of the data. Depending on these
properties, they suit particular applications. The upper level sets of a depth statistic
provide a family of set-valued statistics, named depth-trimmed or central regions.
They describe the distribution regarding its location, scale and shape. The most central
region serves as a median. The notion of depth has been extended from data
clouds, that is empirical distributions, to general probability distributions on $\mathbb{R}^d$,
thus allowing for laws of large numbers and consistency results. It has also been
extended from $d$-variate data to data in functional spaces. The present chapter surveys
the theory and methodology of depth statistics.



Recent reviews on data depth are given in Cascos (2009) and Serfling (2006).
Liu et al. (2006) collects theoretical as well as applied work. More on the theory
of depth functions and many details are found in Zuo and Serfling (2000) and the
monograph by Mosler (2002).
First I provide a general notion of depth functions, which relies on a set of desirable
properties; then a few variants of the properties are discussed. A depth function
induces an outlyingness function and a family of central regions. Further, a stochastic
ordering and a probability metric are generated.



1.1 Postulates on a depth statistic 

Let $E$ be a Banach space, $\mathcal{B}$ its Borel sets in $E$, and $\mathcal{P}$ a set of probability
distributions on $\mathcal{B}$. To start with and in the spirit of Tukey's approach to data analysis, we
may regard $\mathcal{P}$ as the class of empirical distributions giving equal probabilities $1/n$ to
$n$, not necessarily different, data points in $E = \mathbb{R}^d$.

A depth function is a function $D : E \times \mathcal{P} \to [0,1]$, $(z,P) \mapsto D(z|P)$, that satisfies
the restrictions (or 'postulates') D1 to D5 given below. For easier notation I write
$D(z|X)$ in place of $D(z|P)$, where $X$ is an arbitrary random variable distributed as $P$.
For $z \in E$, $P \in \mathcal{P}$, and any random variable $X$ having distribution $P$ it holds:

• D1 Translation invariant: $D(z + b|X + b) = D(z|X)$ for all $b \in E$.

• D2 Linear invariant: $D(Az|AX) = D(z|X)$ for every bijective linear transformation
$A : E \to E$.

• D3 Null at infinity: $\lim_{\|z\| \to \infty} D(z|X) = 0$.

• D4 Monotone on rays: If a point $z^*$ has maximal depth, that is $D(z^*|X) =
\max_{z \in E} D(z|X)$, then for any $r$ in the unit sphere of $E$ the function $\alpha \mapsto D(z^* +
\alpha r|X)$ decreases, in the weak sense, with $\alpha > 0$.

• D5 Upper semicontinuous: The upper level sets $D_\alpha(X) = \{z \in E : D(z|X) \ge \alpha\}$
are closed for all $\alpha$.

D1 and D2 state that a depth function is affine invariant. D3 and D4 mean that
the level sets $D_\alpha$, $\alpha > 0$, are bounded and starshaped about $z^*$. If there is a point
of maximum depth, this depth will w.l.o.g. be set to 1. D5 is a useful technical
restriction. An immediate consequence of restriction D4 is the following:

Proposition 1. If $X$ is centrally symmetric distributed about some $z^* \in E$, then any
depth function $D(\cdot|X)$ is maximal at $z^*$.

Recall that $X$ is centrally symmetric distributed about $z^*$ if the distributions of $X - z^*$
and $z^* - X$ coincide.

Our definition of a depth function differs slightly from that given in Liu (1990)
and Zuo and Serfling (2000). The main difference between these postulates and ours
is that they additionally postulate Proposition 1 to be true and that they do not require
upper semicontinuity D5.

D4 states that the upper level sets $D_\alpha(x^1, \dots, x^n)$ are starshaped with respect to $z^*$.
If a depth function, in place of D4, meets the restriction

• D4con: $D(\cdot|X)$ is a quasiconcave function, that is, its upper level sets $D_\alpha(X)$ are
convex for all $\alpha > 0$,

the depth is called a convex depth. Obviously, as a convex set is starshaped
with respect to each of its points, D4con implies D4. In certain settings the restriction
D2 is weakened to

• D2iso: $D(Az|AX) = D(z|X)$ for every isometric linear transformation $A : E \to E$.

Then, in case $E = \mathbb{R}^d$, $D$ is called an orthogonal invariant depth in contrast to an
affine invariant depth when D2 holds. Alternatively, sometimes D2 is attenuated to
scale invariance,

• D2sca: $D(\lambda z|\lambda X) = D(z|X)$ for all $\lambda > 0$.



1.2 Central regions and outliers 



For given $P$ and $0 \le \alpha \le 1$ the level sets $D_\alpha(P)$ form a nested family of depth-trimmed
or central regions. The innermost region arises at some $\alpha_{\max} \le 1$, which in
general depends on $P$. $D_{\alpha_{\max}}(P)$ is the set of deepest points. D1 and D2 say that the
family of central regions is affine equivariant. Central regions describe a distribution
with respect to location, dispersion, and shape. This has many applications in
multivariate data analysis. On the other hand, given a nested family $\{C_\alpha(P)\}_{\alpha \in [0,1]}$
of set-valued statistics, defined on $\mathcal{P}$, that are convex, bounded and closed, the
function $D$,

$$D(z|P) = \sup\{\alpha : z \in C_\alpha(P)\}, \quad z \in E,\ P \in \mathcal{P}, \tag{1}$$

satisfies D1 to D5 and D4con, hence is a convex depth function.

A depth function $D$ orders data by their degree of centrality. Given a sample, it
provides a center-outward order statistic. The depth induces an outlyingness function
$\mathbb{R}^d \to [0, \infty[$ by

$$\mathrm{Out}(z|X) = \frac{1}{D(z|X)} - 1,$$

which is zero at the center and infinite at infinity. In turn, $D(z|X) = (1 + \mathrm{Out}(z|X))^{-1}$.
Points outside a central region $D_\alpha$ have outlyingness greater than $1/\alpha - 1$; they can
be regarded as outliers of specified level $\alpha$.
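
The relation between depth, outlyingness and central regions can be sketched in a few lines of Python. This is only an illustration; the function names are hypothetical and not taken from any package, and a depth value in $[0,1]$ is assumed to be given.

```python
import math

def outlyingness(depth):
    """Out(z|X) = 1/D(z|X) - 1; infinite where the depth is zero."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must lie in [0, 1]")
    return math.inf if depth == 0.0 else 1.0 / depth - 1.0

def depth_from_outlyingness(out):
    """Inverse map D(z|X) = (1 + Out(z|X))^(-1)."""
    return 0.0 if math.isinf(out) else 1.0 / (1.0 + out)

def is_outlier(depth, alpha):
    """z lies outside the central region D_alpha iff D(z|X) < alpha,
    that is, iff its outlyingness exceeds 1/alpha - 1."""
    return depth < alpha
```

For instance, a point of depth 0.2 has outlyingness 4 and counts as an outlier at level $\alpha = 0.25$, since $1/0.25 - 1 = 3 < 4$.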



1.3 Depth lifts, stochastic orderings, and metrics 



Assume $\alpha_{\max} = 1$ for $P \in \mathcal{P}$. By adding a real dimension to the central regions
$D_\alpha(P)$, $\alpha \in [0,1]$, I construct a set, which will be referred to as the depth lift,

$$\hat{D}(P) = \{(\alpha, y) \in [0,1] \times E : y = \alpha x,\ x \in D_\alpha(P),\ \alpha \in [0,1]\}. \tag{2}$$

The depth lift gives rise to an ordering of probability distributions in $\mathcal{P}$: $P \preceq_D Q$ if

$$\hat{D}(P) \subset \hat{D}(Q). \tag{3}$$

The restriction $\hat{D}(P) \subset \hat{D}(Q)$ is equivalent to $D_\alpha(P) \subset D_\alpha(Q)$ for all $\alpha$. Thus,
$P \preceq_D Q$ means that each central set of $Q$ is larger than the respective central set of
$P$. In this sense, $Q$ is more dispersed than $P$. The depth ordering is antisymmetric,
hence an order, if and only if the family of central regions completely characterizes
the underlying probability. Otherwise it is a preorder only. Finally, the depth $D$
introduces a probability semi-metric on $\mathcal{P}$ by the Hausdorff distance of depth lifts,

$$\delta_D(P, Q) = \delta_H(\hat{D}(P), \hat{D}(Q)). \tag{4}$$

Recall that the Hausdorff distance $\delta_H(C_1, C_2)$ of two compact sets $C_1$ and $C_2$ is
the smallest $\varepsilon$ such that $C_1$ plus the $\varepsilon$-ball includes $C_2$ and vice versa. Again, the
semi-metric is a metric iff the central regions characterize the probability.



2 Multivariate depth functions 

Originally and in most existing applications depth statistics are used with data in
Euclidean space. Multivariate depth statistics are particularly suited to analyze
non-Gaussian or, more generally, non-elliptical distributions in $\mathbb{R}^d$. Without loss of
generality, I consider distributions of full dimension $d$, that is, whose convex hull of
support, $\mathrm{co}(P)$, has affine dimension $d$.

A random vector $X$ in $\mathbb{R}^d$ has a spherical distribution if $AX$ is distributed as $X$ for
every orthogonal matrix $A$. It has an elliptical distribution if $X = a + BY$ for some
$a \in \mathbb{R}^d$, $B \in \mathbb{R}^{d \times d}$, and spherically distributed $Y$; then we write $X \sim \mathrm{Ell}(a, BB', \varphi)$,
where $\varphi$ is the radial distribution of $Y$. Actually, on an elliptical distribution $P =
\mathrm{Ell}(a, BB', \varphi)$, any depth function $D(\cdot, P)$ satisfying D1 and D2 has parallel elliptical
level sets $D_\alpha(P)$, that is, level sets of a quadratic form with scatter matrix $BB'$.
Consequently, all affine invariant depth functions are essentially equivalent if the
distribution is elliptical. Moreover, if $P$ is elliptical and has a unimodal Lebesgue
density $f_P$, the density level sets have the same elliptical shape, and the density is a
transformation of the depth, i.e., a function $\varphi$ exists such that $f_P(z) = \varphi(D(z|P))$ for
all $z \in \mathbb{R}^d$. Similarly, on a spherical distribution, any depth satisfying postulates D1
and D2iso has analogous properties.

In the sequel, I consider three principal approaches to define a multivariate depth
statistic. The first approach is based on distances from properly defined central
points or on volumes, the second on general L-statistics, that is, weighted means of
order statistics, the third on simplices and halfspaces in $\mathbb{R}^d$. The three approaches
have different consequences on the depths' ability to reflect asymmetries of the
distribution, on their robustness to possible outliers, and on their computability with
higher-dimensional data.
Figures 1 to 4 below exhibit bivariate central regions for several depths and
equidistant $\alpha$. The data consist of the unemployment rate (in %) and the GDP share
of public debt for the countries of the European Union in 2011.

Most of the multivariate depths considered are convex and affine invariant, some
exhibit spherical invariance only. Some are continuous in the point $z$ or in the
distribution $P$ (regarding weak convergence), others are not. They differ in the shape
of the depth lift and whether it uniquely determines the underlying distribution.
A basic dispersion ordering of multivariate probability distributions serving as a
benchmark is the dilation order, which says that $Y$ spreads out more than $X$ if
$E[\varphi(X)] \le E[\varphi(Y)]$ holds for every convex $\varphi : \mathbb{R}^d \to \mathbb{R}$; see, e.g. Mosler (2002).
It is interesting whether or not a particular depth ordering is concordant with the
dilation order.



2.1 Depths based on distances 

The outlyingness of a point, and hence its depth, can be measured by a distance 
from a properly chosen center of the distribution. In the following notions this is 
done with different distances and centers. 

L2-depth. The L2-depth, $D^{L_2}$, is based on the mean outlyingness of a point, as
measured by the $L_2$ distance,

$$D^{L_2}(z|X) = (1 + E\|z - X\|)^{-1}. \tag{5}$$

It holds $\alpha_{\max} = 1$. The depth lift is $\hat{D}^{L_2}(X) = \{(\alpha, z) : E\|z - \alpha X\| \le 1 - \alpha\}$ and
convex. For an empirical distribution on points $x^i$, $i = 1, \dots, n$, we obtain

$$D^{L_2}(z|x^1, \dots, x^n) = \Big(1 + \frac{1}{n} \sum_{i=1}^n \|z - x^i\|\Big)^{-1}. \tag{6}$$

Obviously, the L2-depth vanishes at infinity (D3), and is maximum at the spatial median.
(The spatial median of $X$ is the point $z \in \mathbb{R}^d$ that minimizes $E\|z - X\|$.) If the
distribution is centrally symmetric, the center is the spatial median, hence the maximum
is attained at the center. Monotonicity with respect to the deepest point (D4)
as well as convexity and compactness of the central regions (D4con, D5) derive
immediately from the triangle inequality. Further, the L2-depth depends continuously
on $z$. The L2-depth is also continuous in the probability distribution: For a uniformly
integrable and weakly convergent sequence $P_n \to P$ it holds $\lim_n D(z|P_n) = D(z|P)$.

However, the ordering induced by the L2-depth is no sensible ordering of dispersion,
since the L2-depth contradicts the dilation order. As $\|z - x\|$ is convex in $x$, the
expectation $E\|z - X\|$ increases with a dilation of $P$. Hence (5) decreases (!) with a
dilation.

The L2-depth is invariant against rigid Euclidean motions (D1, D2iso), but not
affine invariant. An affine invariant version is constructed as follows: Given a positive
definite $d \times d$ matrix $M$ consider the $M$-norm,



$$\|z\|_M = \sqrt{z' M^{-1} z}, \quad z \in \mathbb{R}^d. \tag{7}$$

Let $S_X$ be a positive definite $d \times d$ matrix that depends continuously (in weak
convergence) on the distribution and measures the dispersion of $X$ in an affine equivariant
way. The latter means that

$$S_{AX+b} = A S_X A' \quad \text{holds for any matrix } A \text{ of full rank and any } b. \tag{8}$$

Then an affine invariant L2-depth is given by

$$(1 + E\|z - X\|_{S_X})^{-1}. \tag{9}$$

Besides invariance, it has the same properties as the L2-depth. A simple choice for
$S_X$ is the covariance matrix $\Sigma_X$ of $X$ (Zuo and Serfling, 2000). Note that the
covariance matrix is positive definite, as the convex hull of the support, $\mathrm{co}(P)$, is
assumed to have full dimension. More robust choices for $S_X$ are the minimum volume
ellipsoid (MVE) or the minimum covariance determinant (MCD) estimators; see
Rousseeuw and Leroy (1987) and Lopuhaä and Rousseeuw (1991).
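
As an illustration, the sample L2-depth and its $M$-norm variant can be sketched as follows. This is a minimal sketch, not a reference implementation: the function names are hypothetical, data points are assumed to be tuples of floats, and the inverse matrix $M^{-1}$ (for example the inverse of a scatter estimate $S_X$) is assumed to be supplied by the caller as a list of rows.

```python
import math

def m_norm(v, m_inv):
    """||v||_M = sqrt(v' M^{-1} v), with M^{-1} passed as a list of rows."""
    d = len(v)
    q = sum(v[i] * m_inv[i][j] * v[j] for i in range(d) for j in range(d))
    return math.sqrt(max(q, 0.0))  # guard against tiny negative rounding errors

def l2_depth(z, data, m_inv=None):
    """Sample L2-depth (1 + mean distance from z to the data)^(-1).

    With m_inv=None the Euclidean norm is used; otherwise the M-norm,
    which gives the affine invariant variant when M is an affine
    equivariant scatter matrix of the data.
    """
    if m_inv is None:
        dists = [math.dist(z, x) for x in data]
    else:
        dists = [m_norm([zi - xi for zi, xi in zip(z, x)], m_inv) for x in data]
    return 1.0 / (1.0 + sum(dists) / len(data))
```

With the two points (0,0) and (2,0), the midpoint (1,0) has mean Euclidean distance 1 and therefore L2-depth 1/2.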




Fig. 1 Public debt in % of GDP (x-axis) and unemployment rate in % (y-axis) of European Union
countries in 2011; Mahalanobis regions (moment, left; MCD, right)



Mahalanobis depths. Let $c_X$ be a vector that measures the location of $X$ in a
continuous and affine equivariant way and, as before, $S_X$ be a matrix that satisfies
(8) and depends continuously on the distribution. Based on the estimates $c_X$ and $S_X$
a simple depth statistic is constructed, the generalized Mahalanobis depth, given by

$$D^{Mah}(z|X) = (1 + \|z - c_X\|^2_{S_X})^{-1}. \tag{10}$$

Obviously, (10) satisfies D1 to D5 and D4con, taking its unique maximum at $c_X$. The
depth lift is the convex set $\hat{D}^{Mah}(X) = \{(\alpha, z) : \|z - \alpha c_X\|^2_{S_X} \le \alpha^2(\alpha^{-1} - 1)\}$, and the
central regions are ellipsoids around $c_X$; see Figure 1. The generalized Mahalanobis
depth is continuous on $z$ and $P$. In particular, with $c_X = E[X]$ and $S_X = \Sigma_X$ the
(moment) Mahalanobis depth is obtained,

$$D^{mMah}(z|X) = \big(1 + (z - E[X])' \Sigma_X^{-1} (z - E[X])\big)^{-1}. \tag{11}$$

Its sample version is

$$D^{mMah}(z|x^1, \dots, x^n) = \big(1 + (z - \bar{x})' \hat{\Sigma}_X^{-1} (z - \bar{x})\big)^{-1}, \tag{12}$$

where $\bar{x}$ is the mean vector and $\hat{\Sigma}_X$ is the empirical covariance matrix. It is easily
seen that the $\alpha$-central set of a sample from $P$ converges almost surely to the
$\alpha$-central set of $P$, for any $\alpha$.
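
The sample version (12) is easy to compute. The following sketch handles the bivariate case only, so that the $2 \times 2$ covariance matrix can be inverted by hand; the function name is hypothetical, and the covariance of the empirical distribution (divisor $n$) is assumed.

```python
def mahalanobis_depth_2d(z, data):
    """Moment Mahalanobis depth of z for bivariate data, as in (12)."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    # Empirical covariance matrix [[sxx, sxy], [sxy, syy]] (divisor n).
    sxx = sum((x - mx) ** 2 for x, _ in data) / n
    syy = sum((y - my) ** 2 for _, y in data) / n
    sxy = sum((x - mx) * (y - my) for x, y in data) / n
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("covariance matrix is not positive definite")
    dx, dy = z[0] - mx, z[1] - my
    # Quadratic form (z - mean)' Sigma^{-1} (z - mean) via the 2x2 inverse.
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return 1.0 / (1.0 + q)
```

The depth equals 1 at the sample mean, and it is unchanged when the data and $z$ undergo the same nonsingular affine transformation, in line with D1 and D2.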

Concerning uniqueness, the Mahalanobis depth fails to identify the underlying
distribution. As only the first two moments are used, any two distributions which
have the same first two moments cannot be distinguished by their Mahalanobis depth
functions. Similarly, the generalized Mahalanobis depth does not determine the
distribution. However, within the family of nondegenerate $d$-variate normal distributions
or, more generally, within any affine family of nondegenerate $d$-variate distributions
having finite second moments, a single contour set of the Mahalanobis depth
suffices to identify the distribution.




Fig. 2 Public debt and unemployment rate; projection depth regions (left), Oja regions (right)



Projection depth. The projection depth (Figure 2) has been proposed in Zuo and Serfling
(2000):

$$D^{proj}(z|X) = \Big(1 + \sup_{p \in S^{d-1}} \frac{|\langle p, z \rangle - \mathrm{med}(\langle p, X \rangle)|}{\mathrm{Dmed}(\langle p, X \rangle)}\Big)^{-1}, \tag{13}$$

where $S^{d-1}$ denotes the unit sphere in $\mathbb{R}^d$, $\langle p, z \rangle$ is the inner product, that is the
projection of $z$ on the line $\{\lambda p : \lambda \in \mathbb{R}\}$, $\mathrm{med}(U)$ is the usual median of a univariate
random variable $U$, and $\mathrm{Dmed}(U) = \mathrm{med}(|U - \mathrm{med}(U)|)$ is the median absolute
deviation from the median. The projection depth satisfies D1 to D5 and D4con.
It has good properties, which are discussed in detail by Zuo and Serfling (2000).
For breakdown properties of the employed location and scatter statistics, see Zuo
(2000).
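
The supremum in (13) is taken over all directions. A crude way to illustrate it for bivariate data is to replace the sphere by a finite grid of directions, which yields an upper bound on the projection depth. The sketch below makes exactly this simplification; the function name and the parameter n_dirs are hypothetical, and the result is an approximation, not the exact depth.

```python
import math
from statistics import median

def projection_depth_2d(z, data, n_dirs=360):
    """Approximate projection depth of z for bivariate data.

    The supremum over the unit circle is replaced by a maximum over
    n_dirs equally spaced directions in [0, pi), so the returned value
    is an upper bound on the exact projection depth.
    """
    worst = 0.0
    for k in range(n_dirs):
        theta = math.pi * k / n_dirs
        p = (math.cos(theta), math.sin(theta))
        proj = [p[0] * x + p[1] * y for x, y in data]
        med = median(proj)
        mad = median(abs(u - med) for u in proj)  # Dmed of the projections
        dev = abs(p[0] * z[0] + p[1] * z[1] - med)
        if mad == 0.0:
            if dev > 0.0:
                return 0.0  # infinite outlyingness in this direction
            continue  # 0/0: direction carries no information
        worst = max(worst, dev / mad)
    return 1.0 / (1.0 + worst)
```

Directions $p$ and $-p$ give the same ratio, which is why half of the circle suffices.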

Oja depth. The Oja depth is not based on distances, but on average volumes of
simplices that have vertices from the data (Zuo and Serfling, 2000); see Figure 2.



2.2 Weighted mean depths 

A large and flexible class of depth statistics corresponds to so-called weighted-mean
central regions, shortly WM regions (Dyckerhoff and Mosler, 2011, 2012). These
are convex compacts in $\mathbb{R}^d$, whose support function is a weighted mean of order
statistics, that is, an L-statistic. Recall that a convex compact $K \subset \mathbb{R}^d$ is uniquely
determined by its support function $h_K$,

$$h_K(p) = \max\{p'x : x \in K\}, \quad p \in S^{d-1}.$$

To define the WM $\alpha$-region of an empirical distribution on $x^1, x^2, \dots, x^n$, I construct
its support function as follows: For $p \in S^{d-1}$, consider the line $\{\lambda p \in \mathbb{R}^d : \lambda \in \mathbb{R}\}$.
By projecting the data on this line a linear ordering is obtained,

$$p'x^{\pi_p(1)} \le p'x^{\pi_p(2)} \le \dots \le p'x^{\pi_p(n)}, \tag{14}$$

and, by this, a permutation $\pi_p$ of the indices $1, 2, \dots, n$. Consider weights $w_{j,\alpha}$ for
$j \in \{1, 2, \dots, n\}$ and $\alpha \in [0,1]$ that satisfy the following restrictions (i) to (iii):

(i) $\sum_{j=1}^n w_{j,\alpha} = 1$, $w_{j,\alpha} \ge 0$ for all $j$ and $\alpha$.
(ii) $w_{j,\alpha}$ increases in $j$ for all $\alpha$.
(iii) $\alpha < \beta$ implies $\sum_{j=1}^k w_{j,\alpha} \le \sum_{j=1}^k w_{j,\beta}$, $k = 1, \dots, n$.



Then, as it has been shown in Dyckerhoff and Mosler (2011), the function $h_{D_\alpha(x^1, \dots, x^n)}$,

$$h_{D_\alpha(x^1, \dots, x^n)}(p) = \sum_{j=1}^n w_{j,\alpha}\, p'x^{\pi_p(j)}, \quad p \in S^{d-1}, \tag{15}$$

is the support function of a convex body $D_\alpha = D_\alpha(x^1, \dots, x^n)$, and $D_\alpha \subset D_\beta$ holds
whenever $\alpha > \beta$. Now we are ready to see the general definition of a family of WM
regions.

Definition 1. Given a weight vector $w_\alpha = (w_{1,\alpha}, \dots, w_{n,\alpha})$ that satisfies the restrictions
(i) to (iii), the convex compact $D_\alpha = D_\alpha(x^1, \dots, x^n)$ having support function (15) is
named the WM region of $x^1, \dots, x^n$ at level $\alpha$, $\alpha \in [0,1]$. The corresponding depth
(1) is the WM depth with weights $w_\alpha$, $\alpha \in [0,1]$.
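
The construction of the support function in (14) and (15) can be sketched directly: project, sort, and form the weighted mean. The sketch below is illustrative only; the function names are hypothetical, and the weight vector is assumed to be given for a fixed level $\alpha$. The helper checks restrictions (i) and (ii) for that single level; restriction (iii) compares different levels and is not checked here.

```python
def check_weights(weights, tol=1e-12):
    """Check restrictions (i) and (ii) for one level alpha."""
    nonneg = all(w >= 0 for w in weights)
    sums_to_one = abs(sum(weights) - 1.0) <= tol
    increasing = all(a <= b + tol for a, b in zip(weights, weights[1:]))
    return nonneg and sums_to_one and increasing

def wm_support(p, data, weights):
    """Support function (15) of the WM region in direction p.

    The data are projected on p and sorted increasingly (this realizes
    the permutation pi_p of (14)); the j-th smallest projection receives
    the weight w_j.
    """
    projections = sorted(sum(pi * xi for pi, xi in zip(p, x)) for x in data)
    return sum(w * v for w, v in zip(weights, projections))
```

Two extreme cases illustrate the definition: equal weights $1/n$ give $p'\bar{x}$, the support function of the mean, while the weight vector $(0, \dots, 0, 1)$ gives $\max_i p'x^i$, the support function of the convex hull of the data.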



Depth statistics 






It follows that the WM depth satisfies the restrictions D1 to D5 and D4con. Moreover,
it holds

$$D_\alpha(x^1, \dots, x^n) = \mathrm{conv}\Big\{ \sum_{j=1}^n w_{j,\alpha}\, x^{\pi(j)} : \pi \text{ permutation of } \{1, \dots, n\} \Big\}. \tag{16}$$

This explains the name by stating that a WM region is the convex hull of weighted
means of the data. Consequently, outside the convex hull of the data the WM depth
vanishes. WM depths are useful statistical tools as their central regions have attractive
analytical and computational properties. Sample WM regions are consistent estimators
for the WM region of the underlying probability. Besides being continuous
in the distribution and in $\alpha$, WM regions are subadditive, that is,

$$D_\alpha(x^1 + y^1, \dots, x^n + y^n) \subset D_\alpha(x^1, \dots, x^n) \oplus D_\alpha(y^1, \dots, y^n),$$

and monotone: If $x^i \le y^i$ holds for all $i$ (in the componentwise ordering of $\mathbb{R}^d$), then

$$D_\alpha(y^1, \dots, y^n) \subset D_\alpha(x^1, \dots, x^n) \oplus \mathbb{R}^d_+, \quad \text{and}$$
$$D_\alpha(x^1, \dots, x^n) \subset D_\alpha(y^1, \dots, y^n) \oplus \mathbb{R}^d_-,$$

where $\oplus$ signifies the Minkowski sum of sets.

Depending on the choice of the weights $w_{j,\alpha}$ different notions of data depths are
obtained. For a detailed discussion of these and other special WM depths and central
regions, the reader is referred to Dyckerhoff and Mosler (2011, 2012).




Fig. 3 Public debt and unemployment rate; zonoid regions (left), ECH* regions (right).



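Zonoid depth. For an empirical distribution $P$ on $x^1, \dots, x^n$ and $0 < \alpha \le 1$ define
the zonoid region (Koshevoy and Mosler, 1997)

$$D^{zon}_\alpha(P) = \Big\{ \sum_{i=1}^n \lambda_i x^i : 0 \le \lambda_i \le \frac{1}{n\alpha},\ \sum_{i=1}^n \lambda_i = 1 \Big\}.$$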

Karl Mosler

The corresponding support function (15) employs the weights

$$w_{j,\alpha} = \begin{cases} 0 & \text{if } j < n - \lfloor n\alpha \rfloor, \\[2pt] \dfrac{n\alpha - \lfloor n\alpha \rfloor}{n\alpha} & \text{if } j = n - \lfloor n\alpha \rfloor, \\[6pt] \dfrac{1}{n\alpha} & \text{if } j > n - \lfloor n\alpha \rfloor. \end{cases} \tag{17}$$

See Figure 3. Many properties of zonoid regions and the zonoid depth $D^{zon}(z|X)$ are
discussed in Mosler (2002). The zonoid depth lift equals the so-called lift zonoid,
which fully characterizes the distribution. Therefore the zonoid depth generates
an antisymmetric depth order (3) and a probability metric (4). Zonoid regions are
not only invariant to affine, but to general linear transformations; specifically any
marginal projection of a zonoid region is the zonoid region of the marginal distribution.
The zonoid depth is continuous on $z$ as well as $P$.
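
The zonoid weights can be tabulated with a few lines of Python. This is a sketch with a hypothetical function name; it assumes the weights take the value 0 below index $n - \lfloor n\alpha \rfloor$, the remainder $(n\alpha - \lfloor n\alpha \rfloor)/(n\alpha)$ at that index, and $1/(n\alpha)$ above it.

```python
import math

def zonoid_weights(n, alpha):
    """Weights w_{1,alpha}, ..., w_{n,alpha} of the zonoid region.

    The floor(n*alpha) largest projections get weight 1/(n*alpha), the
    next one gets the remaining mass, and all smaller ones get zero.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    na = n * alpha
    k = math.floor(na)
    weights = []
    for j in range(1, n + 1):
        if j < n - k:
            weights.append(0.0)
        elif j == n - k:
            weights.append((na - k) / na)
        else:
            weights.append(1.0 / na)
    return weights
```

For $n = 4$ and $\alpha = 0.5$ this gives $(0, 0, 0.5, 0.5)$: the support function averages the two largest projections. For $\alpha = 1$ all weights equal $1/n$, so the region shrinks to the mean.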

Expected convex hull depth. Another important notion of WM depth is that of
expected convex hull (ECH*) depth (Cascos, 2007); see Figure 3. Its central region
$D_\alpha$ has a support function with weights

$$w_{j,\alpha} = \frac{j^{1/\alpha} - (j-1)^{1/\alpha}}{n^{1/\alpha}}. \tag{18}$$

Geometrical depth. The weights

$$w_{j,\alpha} = \begin{cases} \dfrac{1-\alpha}{1-\alpha^n}\, \alpha^{n-j} & \text{if } 0 < \alpha < 1, \\[6pt] \dfrac{1}{n} & \text{if } \alpha = 1, \end{cases}$$

yield another class of WM regions. The respective depth is the geometrically
weighted mean depth.
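
Both weight schemes are easily checked numerically. The sketch below uses hypothetical function names and assumes the ECH* weights $(j^{1/\alpha} - (j-1)^{1/\alpha})/n^{1/\alpha}$ and the geometric weights $\frac{1-\alpha}{1-\alpha^n}\alpha^{n-j}$ for $0 < \alpha < 1$. For $\alpha = 1$ the geometric formula is undefined; the code uses its limit $1/n$ as $\alpha \to 1$, which is an assumption of this sketch.

```python
def ech_star_weights(n, alpha):
    """ECH* weights (j^(1/alpha) - (j-1)^(1/alpha)) / n^(1/alpha)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    e = 1.0 / alpha  # very small alpha may overflow floating point
    return [(j ** e - (j - 1) ** e) / n ** e for j in range(1, n + 1)]

def geometric_weights(n, alpha):
    """Geometric weights (1-alpha)/(1-alpha^n) * alpha^(n-j), 0 < alpha < 1.

    For alpha = 1 the limiting value 1/n is returned (assumption).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if alpha == 1.0:
        return [1.0 / n] * n
    c = (1.0 - alpha) / (1.0 - alpha ** n)
    return [c * alpha ** (n - j) for j in range(1, n + 1)]
```

With $n = 3$ and $\alpha = 0.5$ the ECH* weights are $(1/9, 3/9, 5/9)$ and the geometric weights are $(1/7, 2/7, 4/7)$; both sum to one and increase in $j$, as restrictions (i) and (ii) require.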



2.3 Depths based on half spaces and simplices 

The third approach concerns no distances or volumes, but the combinatorics of halfspaces
and simplices only. In this it is independent of the metric structure of $\mathbb{R}^d$.
While depths that are based on distances or weighted means may be addressed as
metric depths, the following ones will be referred to as combinatorial depths. They
remain constant, as long as the compartment structure of the data does not change.
By this, they are very robust against location outliers. Outside the convex support
$\mathrm{co}(X)$ of the distribution every combinatorial depth attains its minimal value,
which is zero.

Fig. 4 Public debt and unemployment rate; Tukey regions (left), simplicial regions (right).



Location depth. Consider the population version of the location depth,

$$D^{loc}(z|X) = \inf\{P(H) : H \text{ a closed halfspace},\ z \in H\}. \tag{19}$$

The depth is also known as halfspace or Tukey depth, its central regions as Tukey
regions.
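
For a bivariate sample the location depth can be computed exactly by a simple quadratic-time count. The sketch below is one elementary way to do this, not the algorithm of any cited paper; the function name is hypothetical. It uses the fact that the infimum is attained by halfplanes with $z$ on the boundary, and that the count only changes when the boundary line passes through a data point. Sign tests are exact for integer coordinates; with floating point input, nearly collinear configurations may be misjudged.

```python
def halfspace_depth_2d(z, data):
    """Exact sample location (Tukey) depth of z for bivariate data, O(n^2)."""
    n = len(data)
    at_z = sum(1 for x in data if x[0] == z[0] and x[1] == z[1])
    best = n  # stays n only if all data points coincide with z
    for a in data:
        ax, ay = a[0] - z[0], a[1] - z[1]
        if ax == 0 and ay == 0:
            continue
        left = right = fwd = bwd = 0
        for b in data:
            bx, by = b[0] - z[0], b[1] - z[1]
            cross = ax * by - ay * bx
            if cross > 0:
                left += 1
            elif cross < 0:
                right += 1
            else:
                # b lies on the line through z and a
                dot = ax * bx + ay * by
                if dot > 0:
                    fwd += 1
                elif dot < 0:
                    bwd += 1
        # Tilting the line slightly keeps one open side and one of the rays.
        best = min(best, at_z + min(left, right) + min(fwd, bwd))
    return best / n
```

For the four corners of a square, the center has depth 1/2, each corner has depth 1/4, and any point outside the square has depth 0.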

The location depth is affine invariant (D1, D2). Its central regions are convex
(D4con) and closed (D5); see Figure 4. The maximum value of the location depth
is $\le 1$ depending on the distribution. The set of all points attaining this maximum is
called the halfspace median set and each of its elements a Tukey median (Tukey, 1975).

If $X$ has an angular symmetric distribution, the location depth attains its maximum
at the center and the center is a Tukey median, which strengthens Proposition 1.
(A distribution is called angular (= halfspace) symmetric about $z^*$ if
$P(X \in H) \ge 1/2$ for every closed halfspace $H$ having $z^*$ on the boundary; equivalently,
if $(X - z^*)/\|X - z^*\|$ is centrally symmetric with the convention $0/0 = 0$.)

If $X$ has a Lebesgue density, the location depth depends continuously on $z$; otherwise
the dependence on $z$ is noncontinuous and there can be more than one point
where the maximum is attained. As a function of $P$ the location depth is obviously
noncontinuous. It determines the distribution in a unique way if the distribution is
either discrete (Struyf and Rousseeuw, 1999; Koshevoy, 2002) or continuous with
compact support. The location depth of a sample from $P$ converges almost surely to
the location depth of $P$ (Donoho and Gasko, 1992). The next depth notion involves
simplices in $\mathbb{R}^d$.

Simplicial depth. Liu (1990) defines the simplicial depth as follows:

$$D^{sim}(z|X) = P\big(z \in \mathrm{co}(\{X_1, \dots, X_{d+1}\})\big), \tag{20}$$

where $X_1, \dots, X_{d+1}$ are i.i.d. by $P$. The sample version reads as

$$D^{sim}(z|x^1, \dots, x^n) = \binom{n}{d+1}^{-1} \#\big\{\{i_1, \dots, i_{d+1}\} : z \in \mathrm{co}(\{x^{i_1}, \dots, x^{i_{d+1}}\})\big\}. \tag{21}$$
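
For bivariate data the sample version amounts to counting the closed triangles, spanned by three data points, that contain $z$. A brute-force sketch (hypothetical function names, cubic running time) follows; it assumes that no three data points are collinear, since degenerate triangles are not treated correctly by the sign test used here.

```python
from itertools import combinations
from math import comb

def _cross(o, a, b):
    """Cross product of the vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_closed_triangle(z, a, b, c):
    """True if z lies in the closed triangle abc (non-degenerate triangle assumed)."""
    d1, d2, d3 = _cross(a, b, z), _cross(b, c, z), _cross(c, a, z)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)

def simplicial_depth_2d(z, data):
    """Sample simplicial depth: share of data triangles that contain z."""
    hits = sum(1 for a, b, c in combinations(data, 3) if in_closed_triangle(z, a, b, c))
    return hits / comb(len(data), 3)
```

With the corners (0,0), (4,0), (0,4), (4,4) of a square, the point (1,2) lies in two of the four triangles, so its simplicial depth is 1/2.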
The simplicial depth is affine invariant (D1, D2). It has maximum $\le 1$ depending
on the distribution. In general, the point of maximum simplicial depth is not unique;
the simplicial median is defined as the gravity center of these points. The sample
simplicial depth converges almost surely uniformly in $z$ to its population version
(Liu, 1990; Dümbgen, 1992). The simplicial depth has positive breakdown (Chen,
1995).

If the distribution is Lebesgue-continuous, the simplicial depth behaves well: It
varies continuously on $z$ (Liu, 1990, Th. 2), is maximum at a center of angular
symmetry, and decreases monotonically from a deepest point (D4). The simplicial
central regions of a Lebesgue-continuous distribution are connected and compact
(Liu, 1990).

However, if the distribution is discrete, each of these properties can fail; for
counterexamples see, e.g., Zuo and Serfling (2000). The simplicial depth characterizes an
empirical measure if the supporting points are in general position, that is, if no more
than $d$ of the points lie on the same hyperplane; see Figure 4.

Other well known combinatorial data depths are the majority depth (Liu and Singh,
1993) and the convex-hull peeling depth (Barnett, 1976; Donoho and Gasko, 1992).
However, the latter possesses no population version.



3 Functional data depth 

The analysis of functional data has become a practically important branch of statistics;
see Ramsay and Silverman (2005). Consider a space $E$ of functions $[0,1] \to \mathbb{R}$
with the supremum norm. Like a multivariate data depth, a functional data depth
is a real-valued functional that indicates how 'deep' a function $z \in E$ is located in
a given finite cloud of functions $\in E$. Let $E'$ denote the set of continuous linear
functionals $E \to \mathbb{R}$, and $E'^d$ the $d$-fold Cartesian product of $E'$. Here, following
Mosler and Polyakova (2012), functional depths of a general form (22) are presented.
Some alternative approaches will be addressed below.

$\Phi$-depth. For $z \in E$ and an empirical distribution $X$ on $x^1, \dots, x^n \in E$, define a
functional data depth by

$$D(z|X) = \inf_{\varphi \in \Phi} D^d(\varphi(z)|\varphi(X)), \tag{22}$$

where $D^d$ is a $d$-variate data depth satisfying D1 to D5, $\Phi \subset E'^d$, and $\varphi(X)$ the
empirical distribution on $\varphi(x^1), \dots, \varphi(x^n)$. $D$ is called a $\Phi$-depth. A population version
is similarly defined.

Each $\varphi$ in this definition may be regarded as a particular 'aspect' we are interested
in and which is represented in $d$-dimensional space. The depth of $z$ is given as
the smallest multivariate depth of $z$ under all these aspects. It implies that all aspects
are equally relevant so that the depth of $z$ cannot be larger than its depth under any
aspect.
As the $d$-variate depth $D^d$ has maximum $\le 1$, the functional data depth $D$ is
bounded above by 1. At every point $z^*$ of maximal $D$-depth it holds $D(z^*|X) \le 1$.
The bound is attained with equality, $D(z^*|X) = 1$, iff $D^d(\varphi(z^*)|\varphi(X)) = 1$ holds for
all $\varphi \in \Phi$, that is, iff

$$z^* \in \bigcap_{\varphi \in \Phi} \varphi^{-1}\big(D^d_1(\varphi(X))\big). \tag{23}$$

A $\Phi$-depth (22) always satisfies D1, D2sca, D4, and D5. It satisfies D3 if for
every sequence $(z^i)$ with $\|z^i\| \to \infty$ there exists a $\varphi$ in $\Phi$ such that $\varphi(z^i) \to \infty$. D4con is
met if D4con holds for the underlying $d$-variate depth.

I now proceed with specifying the set $\Phi$ of functionals and the multivariate depth
$D^d$ in (22). While many features of the functional data depth (22) resemble those of a
multivariate depth, an important difference must be pointed out: In a general Banach
space the unit ball $B$ is not compact, and properties D3 and D5 do not imply that the
level sets of a functional data depth are compact. So, to obtain a meaningful notion
of functional data depth of type (22) one has to carefully choose a set of functions
$\Phi$ that is not too large. On the other hand, $\Phi$ should not be too small, in order to
extract sufficient information from the data.

Graph depths. For $x \in E$ notate $x(t) = (x_1(t), \dots, x_d(t))$ and consider

$$\Phi = \{\varphi^t : E \to \mathbb{R}^d : \varphi^t(x) = (x_1(t), \dots, x_d(t)),\ t \in T\} \tag{24}$$

for some $T \subset [0,1]$, which may be a subinterval or a finite set. For $D^d$ use any
multivariate depth that satisfies D1 to D5. This results in the graph depth

$$GD(z|x^1, \dots, x^n) = \inf_{t \in T} D^d(z(t)|x^1(t), \dots, x^n(t)). \tag{25}$$

In particular, with the univariate halfspace depth, $d = 1$ and $T = J$ we obtain the
half graph depth (López-Pintado and Romo, 2005). Also, with the univariate simplicial
depth the band depth (López-Pintado and Romo, 2009) is obtained, but this, in
general, violates monotonicity D4.
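
For univariate curves observed on a common finite grid, the graph depth with the univariate halfspace depth reduces to a pointwise minimum. The sketch below assumes this discretized setting: each curve is a list of values on the same grid, $T$ is a finite set of grid indices, and the function names are hypothetical.

```python
def univariate_halfspace_depth(z, values):
    """Univariate location depth: min share of data on either closed side of z."""
    n = len(values)
    below = sum(1 for v in values if v <= z)
    above = sum(1 for v in values if v >= z)
    return min(below, above) / n

def half_graph_depth(z, curves, grid_indices=None):
    """Graph depth (25) with d = 1 and the univariate halfspace depth.

    z and every curve are sequences of values on a common grid; the
    infimum over T becomes a minimum over the chosen grid indices.
    """
    idx = range(len(z)) if grid_indices is None else grid_indices
    return min(
        univariate_halfspace_depth(z[t], [x[t] for x in curves]) for t in idx
    )
```

A curve that leaves the band spanned by the data at a single grid point already has depth zero, which reflects the infimum over all aspects in the definition of a $\Phi$-depth.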

Grid depths. We choose a finite number of points in $J$, $t_1, \dots, t_k$, and evaluate a
function $z \in E$ at these points. Notate $\underline{t} = (t_1, \dots, t_k)$ and $z(\underline{t}) = (z_1(\underline{t}), \dots, z_d(\underline{t}))^T$.
That is, in place of the function $z$ the $k \times d$ matrix $z(\underline{t})$ is considered. A grid depth
$RD$ is defined by (22) with the following $\Phi$,

$$\Phi = \big\{\varphi^r : \varphi^r(z) = (\langle r, z_1(\underline{t}) \rangle, \dots, \langle r, z_d(\underline{t}) \rangle),\ r \in S^{k-1}\big\}, \tag{26}$$

which yields

$$RD(z|x^1, \dots, x^n) = \inf_{r \in S^{k-1}} D^d\big(\langle r, z(\underline{t}) \rangle \,\big|\, \langle r, x^1(\underline{t}) \rangle, \dots, \langle r, x^n(\underline{t}) \rangle\big). \tag{27}$$

A slight extension of the $\Phi$-depth is the principal components depth (Mosler and Polyakova,
2012). However, certain approaches from the literature are no $\Phi$-depths. These
are mainly of two types. The first type employs random projections of the data:
Cuesta-Albertos and Nieto-Reyes (2008b) define the depth of a function as the
univariate depth of the function values taken at a randomly chosen argument $t$.
Cuevas et al. (2007) also employ a random projection method. The other type
uses average univariate depths. Fraiman and Muniz (2001) calculate the univariate
depths of the values of a function and integrate them over the whole interval; this
results in a kind of 'average' depth. Claeskens et al. (2012) introduce a multivariate
($d \ge 1$) functional data depth, where they similarly compute a weighted average
depth. The weight at a point reflects the variability of the function values at this
point (more precisely: is proportional to the volume of a central region at the point).



3.1 Computation of depths and central regions in 



The moment Mahalanobis depth and its elliptical central regions are obtained in any
dimension by calculating the mean and the sample covariance matrix, while robust
Mahalanobis depths and regions are determined with the R-procedures "cov.mcd"
and "cov.mve". In dimension $d = 2$ the central regions of many depth notions can
be exactly calculated by following a circular sequence (Edelsbrunner, 1987). The
R-package "depth" computes the exact location ($d = 2, 3$) and simplicial ($d = 2$)
depths, as well as the Oja depth and an approximative location depth for any dimension.
An exact algorithm for the location depth in any dimension is developed
in Liu and Zuo (2012). Cuesta-Albertos and Nieto-Reyes (2008a) propose to calculate
instead the random Tukey depth, which is the minimum univariate location
depth of univariate projections in a number of randomly chosen directions. With
the algorithm of Paindaveine and Šiman (2012) Tukey regions are obtained, $d \ge 2$.
The bivariate projection depth is computed by the R-package "ExPD2D"; for the
respective regions, see Liu et al. (2011). The zonoid depth can be efficiently determined
in any dimension (Dyckerhoff et al., 1996). An R-package ("WMTregions")
exists for the exact calculation of zonoid and general WM regions; see Mosler et al.
(2009); Bazovkin and Mosler (2012). The R-package "rainbow" calculates several
functional data depths.
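
The idea of the random Tukey depth can be illustrated in plain Python. The sketch below is not the implementation of any of the packages named above; the function name, the default number of directions and the fixed seed are assumptions made for the example. Directions are drawn from a standard normal distribution and need not be normalized, because only the ordering of the projections matters.

```python
import random

def random_tukey_depth(z, data, n_dirs=500, seed=0):
    """Random Tukey depth: minimum univariate location depth of the
    projections of z over n_dirs random directions.

    The result is an upper bound on the exact location depth.
    """
    rng = random.Random(seed)
    d, n = len(z), len(data)
    best = 1.0
    for _ in range(n_dirs):
        p = [rng.gauss(0.0, 1.0) for _ in range(d)]
        zp = sum(pi * zi for pi, zi in zip(p, z))
        proj = [sum(pi * xi for pi, xi in zip(p, x)) for x in data]
        below = sum(1 for v in proj if v <= zp)
        above = sum(1 for v in proj if v >= zp)
        best = min(best, min(below, above) / n)
    return best
```

The number of directions governs the trade-off between running time and the tightness of the upper bound; how to choose it is left open here.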



4 Applications 

Depth statistics have been used in numerous and diverse tasks of which I can
mention a few only. Liu et al. (1999) provide an introduction to some of them.
In descriptive multivariate analysis, depth functions and central regions visualize
the data regarding location, scale and shape. By bagplots and sunburst plots outliers
can be identified and treated in an interactive way. In $k$-class supervised classification,
each - possibly high-dimensional - data point is represented in $[0,1]^k$
by its values of depth in the $k$ given classes, and classification is done in $[0,1]^k$.
Functions of depth statistics include depth-weighted statistical functionals, such
as $\int_{\mathbb{R}^d} x\, w(D(x|P))\, dP \big/ \int_{\mathbb{R}^d} w(D(x|P))\, dP$ for location. In inference, tests for
goodness of fit and homogeneity regarding location, scale and symmetry are based on
depth statistics; see, e.g., Dyckerhoff (2002); Ley and Paindaveine (2011). Applications
include such diverse fields as statistical control (Liu and Singh, 1993),
measurement of risk (Cascos and Molchanov, 2007), and robust linear programming
(Bazovkin and Mosler, 2011). Functional data depth is applied to similar tasks in
description, classification and testing; see e.g. López-Pintado and Romo (2009);
Cuevas et al. (2007).



5 Conclusions and outlook 

This survey has covered the fundamentals of depth statistics for $d$-variate and functional
data. Several special depth functions in $\mathbb{R}^d$ have been presented, metric and
combinatorial ones, with a focus on the recent class of WM depths. For functional
data, depths of infimum type have been discussed. Of course, such a survey is necessarily
incomplete and biased by the preferences of the author. Of the many applications
of depth in the literature only a few have been touched, and important
theoretical extensions like regression depth (Rousseeuw and Hubert, 1999), depth
calculus (Mizera, 2002), location-scale depth (Mizera and Müller, 2004), and
likelihood depth (Müller, 2005) have been completely omitted.

Most important in the selection of a depth statistic in multivariate analysis (and 
of D d in functional data analysis) are the questions of computability and - depending 
on the data situation - robustness. Mahalanobis depth is solely based on estimates 
of the mean vector and the covariance matrix. In its classical form with moment 
estimates Mahalanobis depth is efficiently calculated but highly non-robust, while 
with estimates like the minimum volume ellipsoid it becomes more robust. However, 
since it is constant on ellipsoids around the center, Mahalanobis depth cannot reflect 
possible asymmetries of the data. Zonoid depth can be efficiently calculated, also in 
larger dimensions, but has the drawback that the deepest point is always the mean, 
which makes the depth non-robust. So, if robustness is an issue, the zonoid depth 
has to be combined with a proper preprocessing of the data to identify possible 
outliers. The location depth is, by construction, very robust but expensive when 
exactly computed in dimensions > 3. As an efficient approach the random Tukey 
depth yields an upper bound on the location depth, where the number of directions 
has to be somehow chosen. 

A depth statistic measures the centrality of a point in the data. Besides ordering
the data it provides numerical values that, with some depth notions, have an obvious
meaning; this holds for the location depth and all WM depths. With other depths, in
particular those based on distances, the outlyingness function has an obvious
interpretation.

More research is needed in three fields: Finding multivariate (non-convex, lo- 
cal) depths for the analysis of multimodal distributions, constructing and applying 
depth statistics in more general spaces, developing efficient exact and approximate 
algorithms to calculate depths and central regions in higher dimensions. 